SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Garoff, Henrik 

Liljestrom, Peter 

(ii) TITLE OF INVENTION: DNA Expression Systems Based on 
Alphaviruses 

(iii) NUMBER OF SEQUENCES: 27 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Birch, Stewart, Kolasch & Birch 

(B) STREET: P.O. Box 747 

(C) CITY: Falls Church 

(D) STATE: Virginia 

(E) COUNTRY: USA 

(F) ZIP: 22040-0747 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US 07/920,281 

(B) FILING DATE: 13-AUG-1992

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Murphy Jr., Gerald M. 

(B) REGISTRATION NUMBER: 28,977 

(C) REFERENCE/DOCKET NUMBER: 828-103P 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 703-241-1300 

(B) TELEFAX: 703-241-2848 

(C) TELEX: 248345 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11517 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 









(ii) MOLECULE TYPE: RNA (genomic) 
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Semliki Forest Virus 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..11517

(D) OTHER INFORMATION: /label= genome

/note= "Semliki Forest Virus complete nucleotide
sequence, presented as a cloned DNA sequence; see
Figure 5."



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 87..7379

(D) OTHER INFORMATION: /product= "SFV polyprotein"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 7421..11179

(D) OTHER INFORMATION: /product= "SFV polyprotein"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GATGGCGGAT GTGTGACATA CACGACGCCA AAAGATTTTG TTCCAGCTCC TGCCACCTCC 60

GCTACGCGAG AGATTAACCA CCCACG ATG GCC GCC AAA GTG CAT GTT GAT ATT 113
Met Ala Ala Lys Val His Val Asp Ile
1 5

GAG GCT GAC AGC CCA TTC ATC AAG TCT TTG CAG AAG GCA TTT CCG TCG 161
Glu Ala Asp Ser Pro Phe Ile Lys Ser Leu Gln Lys Ala Phe Pro Ser
10 15 20 25

TTC GAG GTG GAG TCA TTG CAG GTC ACA CCA AAT GAC CAT GCA AAT GCC 209
Phe Glu Val Glu Ser Leu Gln Val Thr Pro Asn Asp His Ala Asn Ala
30 35 40

AGA GCA TTT TCG CAC CTG GCT ACC AAA TTG ATC GAG CAG GAG ACT GAC 257
Arg Ala Phe Ser His Leu Ala Thr Lys Leu Ile Glu Gln Glu Thr Asp
45 50 55

AAA GAC ACA CTC ATC TTG GAT ATC GGC AGT GCG CCT TCC AGG AGA ATG 305
Lys Asp Thr Leu Ile Leu Asp Ile Gly Ser Ala Pro Ser Arg Arg Met
60 65 70

ATG TCT ACG CAC AAA TAC CAC TGC GTA TGC CCT ATG CGC AGC GCA GAA 353
Met Ser Thr His Lys Tyr His Cys Val Cys Pro Met Arg Ser Ala Glu
75 80 85

GAC CCC GAA AGG CTC GAT AGC TAC GCA AAG AAA CTG GCA GCG GCC TCC 401
Asp Pro Glu Arg Leu Asp Ser Tyr Ala Lys Lys Leu Ala Ala Ala Ser
90 95 100 105

GGG AAG GTG CTG GAT AGA GAG ATC GCA GGA AAA ATC ACC GAC CTG CAG 449
Gly Lys Val Leu Asp Arg Glu Ile Ala Gly Lys Ile Thr Asp Leu Gln
110 115 120

ACC GTC ATG GCT ACG CCA GAC GCT GAA TCT CCT ACC TTT TGC CTG CAT 497
Thr Val Met Ala Thr Pro Asp Ala Glu Ser Pro Thr Phe Cys Leu His
125 130 135

ACA GAC GTC ACG TGT CGT ACG GCA GCC GAA GTG GCC GTA TAC CAG GAC 545
Thr Asp Val Thr Cys Arg Thr Ala Ala Glu Val Ala Val Tyr Gln Asp
140 145 150

GTG TAT GCT GTA CAT GCA CCA ACA TCG CTG TAC CAT CAG GCG ATG AAA 593
Val Tyr Ala Val His Ala Pro Thr Ser Leu Tyr His Gln Ala Met Lys
155 160 165

GGT GTC AGA ACG GCG TAT TGG ATT GGG TTT GAC ACC ACC CCG TTT ATG 641
Gly Val Arg Thr Ala Tyr Trp Ile Gly Phe Asp Thr Thr Pro Phe Met
170 175 180 185

TTT GAC GCG CTA GCA GGC GCG TAT CCA ACC TAC GCC ACA AAC TGG GCC 689
Phe Asp Ala Leu Ala Gly Ala Tyr Pro Thr Tyr Ala Thr Asn Trp Ala
190 195 200

GAC GAG CAG GTG TTA CAG GCC AGG AAC ATA GGA CTG TGT GCA GCA TCC 737
Asp Glu Gln Val Leu Gln Ala Arg Asn Ile Gly Leu Cys Ala Ala Ser
205 210 215

TTG ACT GAG GGA AGA CTC GGC AAA CTG TCC ATT CTC CGC AAG AAG CAA 785
Leu Thr Glu Gly Arg Leu Gly Lys Leu Ser Ile Leu Arg Lys Lys Gln
220 225 230

TTG AAA CCT TGC GAC ACA GTC ATG TTC TCG GTA GGA TCT ACA TTG TAC 833
Leu Lys Pro Cys Asp Thr Val Met Phe Ser Val Gly Ser Thr Leu Tyr
235 240 245

ACT GAG AGC AGA AAG CTA CTG AGG AGC TGG CAC TTA CCC TCC GTA TTC 881
Thr Glu Ser Arg Lys Leu Leu Arg Ser Trp His Leu Pro Ser Val Phe
250 255 260 265

CAC CTG AAA GGT AAA CAA TCC TTT ACC TGT AGG TGC GAT ACC ATC GTA 929
His Leu Lys Gly Lys Gln Ser Phe Thr Cys Arg Cys Asp Thr Ile Val
270 275 280

TCA TGT GAA GGG TAC GTA GTT AAG AAA ATC ACT ATG TGC CCC GGC CTG 977
Ser Cys Glu Gly Tyr Val Val Lys Lys Ile Thr Met Cys Pro Gly Leu
285 290 295

TAC GGT AAA ACG GTA GGG TAC GCC GTG ACG TAT CAC GCG GAG GGA TTC 1025
Tyr Gly Lys Thr Val Gly Tyr Ala Val Thr Tyr His Ala Glu Gly Phe
300 305 310

CTA GTG TGC AAG ACC ACA GAC ACT GTC AAA GGA GAA AGA GTC TCA TTC 1073
Leu Val Cys Lys Thr Thr Asp Thr Val Lys Gly Glu Arg Val Ser Phe
315 320 325

CCT GTA TGC ACC TAC GTC CCC TCA ACC ATC TGT GAT CAA ATG ACT GGC 1121
Pro Val Cys Thr Tyr Val Pro Ser Thr Ile Cys Asp Gln Met Thr Gly
330 335 340 345

ATA CTA GCG ACC GAC GTC ACA CCG GAG GAC GCA CAG AAG TTG TTA GTG 1169
Ile Leu Ala Thr Asp Val Thr Pro Glu Asp Ala Gln Lys Leu Leu Val
350 355 360

GGA TTG AAT CAG AGG ATA GTT GTG AAC GGA AGA ACA CAG CGA AAC ACT 1217
Gly Leu Asn Gln Arg Ile Val Val Asn Gly Arg Thr Gln Arg Asn Thr
365 370 375

AUc ACG ATG AAG AAC TAT CTG CTT CCG ATT GTG GCC GTC GCA TTT AGC 1265
... Thr Met Lys Asn Tyr Leu Leu Pro Ile Val Ala Val Ala Phe Ser
380 385 390

AAG TGG GCG AGG GAA TAC AAG GCA GAC CTT GAT GAT GAA AAA CCT CTG 1313
Lys Trp Ala Arg Glu Tyr Lys Ala Asp Leu Asp Asp Glu Lys Pro Leu
395 400 405

... GTC CGA GAG AGG TCA CTT ACT TGC TGC TGC TTG TGG GCA TTT AAA 1361
Gly Val Arg Glu Arg Ser Leu Thr Cys Cys Cys Leu Trp Ala Phe Lys
410 415 420 425

ACG AGG AAG ATG CAC ACC ATG TAC AAG AAA CCA GAC ACC CAG ACA ATA 1409
Thr Arg Lys Met His Thr Met Tyr Lys Lys Pro Asp Thr Gln Thr Ile
430 435 440

GTG AAG GTG CCT TCA GAG TTT AAC TCG TTC GTC ATC CCG AGC CTA TGG 1457
Val Lys Val Pro Ser Glu Phe Asn Ser Phe Val Ile Pro Ser Leu Trp
445 450 455

TCT ACA GGC CTC GCA ATC CCA GTC AGA TCA CGC ATT AAG ATG CTT TTG 1505
Ser Thr Gly Leu Ala Ile Pro Val Arg Ser Arg Ile Lys Met Leu Leu
460 465 470

GCC AAG AAG ACC AAG CGA GAG TTA ATA CCT GTT CTC GAC GCG TCG TCA 1553
Ala Lys Lys Thr Lys Arg Glu Leu Ile Pro Val Leu Asp Ala Ser Ser
475 480 485

GCC AGG GAT GCT GAA CAA GAG GAG AAG GAG AGG TTG GAG GCC GAG CTG 1601
Ala Arg Asp Ala Glu Gln Glu Glu Lys Glu Arg Leu Glu Ala Glu Leu
490 495 500 505

ACT AGA GAA GCC TTA CCA CCC CTC GTC CCC ATC GCG CCG GCG GAG ACG 1649
Thr Arg Glu Ala Leu Pro Pro Leu Val Pro Ile Ala Pro Ala Glu Thr
510 515 520

GGA GTC GTC GAC GTC GAC GTT GAA GAA CTA GAG TAT CAC GCA GGT GCA 1697
Gly Val Val Asp Val Asp Val Glu Glu Leu Glu Tyr His Ala Gly Ala
525 530 535

GGG GTC GTG GAA ACA CCT CGC AGC GCG TTG AAA GTC ACC GCA CAG CCG 1745
Gly Val Val Glu Thr Pro Arg Ser Ala Leu Lys Val Thr Ala Gln Pro
540 545 550

AAC GAC GTA CTA CTA GGA AAT TAC GTA GTT CTG TCC CCG CAG ACC GTG 1793
Asn Asp Val Leu Leu Gly Asn Tyr Val Val Leu Ser Pro Gln Thr Val
555 560 565

CTC AAG AGC TCC AAG TTG GCC CCC GTG CAC CCT CTA GCA GAG CAG GTG 1841
Leu Lys Ser Ser Lys Leu Ala Pro Val His Pro Leu Ala Glu Gln Val
570 575 580 585

AAA ATA ATA ACA CAT AAC GGG AGG GCC GGC GGT TAC CAG GTC GAC GGA 1889
Lys Ile Ile Thr His Asn Gly Arg Ala Gly Gly Tyr Gln Val Asp Gly
590 595 600

TAT GAC GGC AGG GTC CTA CTA CCA TGT GGA TCG GCC ATT CCG GTC CCT 1937
Tyr Asp Gly Arg Val Leu Leu Pro Cys Gly Ser Ala Ile Pro Val Pro
605 610 615

GAG TTT CAA GCT TTG AGC GAG AGC GCC ACT ATG GTG TAC AAC GAA AGG 1985
Glu Phe Gln Ala Leu Ser Glu Ser Ala Thr Met Val Tyr Asn Glu Arg
620 625 630

GAG TTC GTC AAC AGG AAA CTA TAC CAT ATT GCC GTT CAC GGA CCG TCG 2033
Glu Phe Val Asn Arg Lys Leu Tyr His Ile Ala Val His Gly Pro Ser
635 640 645

CTG AAC ACC GAC GAG GAG AAC TAC GAG AAA GTC AGA GCT GAA AGA ACT 2081
Leu Asn Thr Asp Glu Glu Asn Tyr Glu Lys Val Arg Ala Glu Arg Thr
650 655 660 665

GAC GCC GAG TAC GTG TTC GAC GTA GAT AAA AAA TGC TGC GTC AAG AGA 2129
Asp Ala Glu Tyr Val Phe Asp Val Asp Lys Lys Cys Cys Val Lys Arg
670 675 680

GAG GAA GCG TCG GGT TTG GTG TTG GTG GGA GAG CTA ACC AAC CCC CCG 2177
Glu Glu Ala Ser Gly Leu Val Leu Val Gly Glu Leu Thr Asn Pro Pro
685 690 695

TTC CAT GAA TTC GCC TAC GAA GGG CTG AAG ATC AGG CCG TCG GCA CCA 2225
Phe His Glu Phe Ala Tyr Glu Gly Leu Lys Ile Arg Pro Ser Ala Pro
700 705 710

TAT AAG ACT ACA GTA GTA GGA GTC TTT GGG GTT CCG GGA TCA GGC AAG 2273
Tyr Lys Thr Thr Val Val Gly Val Phe Gly Val Pro Gly Ser Gly Lys
715 720 725

TCT GCT ATT ATT AAG AGC CTC GTG ACC AAA CAC GAT CTG GTC ACC AGC 2321
Ser Ala Ile Ile Lys Ser Leu Val Thr Lys His Asp Leu Val Thr Ser
730 735 740 745

GGC AAG AAG GAG AAC TGC CAG GAA ATA GTT AAC GAC GTG AAG AAG CAC 2369
Gly Lys Lys Glu Asn Cys Gln Glu Ile Val Asn Asp Val Lys Lys His
750 755 760

CGC GGG AAG GGG ACA AGT AGG GAA AAC AGT GAC TCC ATC CTG CTA AAC 2417
Arg Gly Lys Gly Thr Ser Arg Glu Asn Ser Asp Ser Ile Leu Leu Asn
765 770 775

GGG TGT CGT CGT GCC GTG GAC ATC CTA TAT GTG GAC GAG GCT TTC GCT 2465
Gly Cys Arg Arg Ala Val Asp Ile Leu Tyr Val Asp Glu Ala Phe Ala
780 785 790

TGC CAT TCC GGT ACT CTG CTG GCC CTA ATT GCT CTT GTT AAA CCT CGG 2513
Cys His Ser Gly Thr Leu Leu Ala Leu Ile Ala Leu Val Lys Pro Arg
795 800 805

Sbc AAA GTG GTG TTA TGC GGA GAC CCC AAG CAA TGC GGA TTC TTC AAT 2561
Ser Lys Val Val Leu Cys Gly Asp Pro Lys Gln Cys Gly Phe Phe Asn
810 815 820 825

ATG ATG CAG CTT AAG GTG AAC TTC AAC CAC AAC ATC TGC ACT GAA GTA 2609
Met Met Gln Leu Lys Val Asn Phe Asn His Asn Ile Cys Thr Glu Val
830 835 840

TGT CAT AAA AGT ATA TCC AGA CGT TGC ACG CGT CCA GTC ACG GCC ATC 2657
Cys His Lys Ser Ile Ser Arg Arg Cys Thr Arg Pro Val Thr Ala Ile
845 850 855

GTG TCT ACG TTG CAC TAC GGA GGC AAG ATG CGC ACG ACC AAC CCG TGC 2705
Val Ser Thr Leu His Tyr Gly Gly Lys Met Arg Thr Thr Asn Pro Cys
860 865 870

AAC AAA CCC ATA ATC ATA GAC ACC ACA GGA CAG ACC AAG CCC AAG CCA 2753
Asn Lys Pro Ile Ile Ile Asp Thr Thr Gly Gln Thr Lys Pro Lys Pro
875 880 885

GGA GAC ATC GTG TTA ACA TGC TTC CGA GGC TGG GCA AAG CAG CTG CAG 2801
Gly Asp Ile Val Leu Thr Cys Phe Arg Gly Trp Ala Lys Gln Leu Gln
890 895 900 905

TTG GAC TAC CGT GGA CAC GAA GTC ATG ACA GCA GCA GCA TCT CAG GGC 2849
Leu Asp Tyr Arg Gly His Glu Val Met Thr Ala Ala Ala Ser Gln Gly
910 915 920

CTC ACC CGC AAA GGG GTA TAC GCC GTA AGG CAG AAG GTG AAT GAA AAT 2897
Leu Thr Arg Lys Gly Val Tyr Ala Val Arg Gln Lys Val Asn Glu Asn
925 930 935

CCC TTG TAT GCC CCT GCG TCG GAG CAC GTG AAT GTA CTG CTG ACG CGC 2945
Pro Leu Tyr Ala Pro Ala Ser Glu His Val Asn Val Leu Leu Thr Arg
940 945 950

ACT GAG GAT AGG CTG GTG TGG AAA ACG CTG GCC GGC GAT CCC TGG ATT 2993
Thr Glu Asp Arg Leu Val Trp Lys Thr Leu Ala Gly Asp Pro Trp Ile
955 960 965

AAG GTC CTA TCA AAC ATT CCA CAG GGT AAC TTT ACG GCC ACA TTG GAA 3041
Lys Val Leu Ser Asn Ile Pro Gln Gly Asn Phe Thr Ala Thr Leu Glu
970 975 980 985

GAA TGG CAA GAA GAA CAC GAC AAA ATA ATG AAG GTG ATT GAA GGA CCG 3089
Glu Trp Gln Glu Glu His Asp Lys Ile Met Lys Val Ile Glu Gly Pro
990 995 1000

GCT GCG CCT GTG GAC GCG TTC CAG AAC AAA GCG AAC GTG TGT TGG GCG 3137
Ala Ala Pro Val Asp Ala Phe Gln Asn Lys Ala Asn Val Cys Trp Ala
1005 1010 1015

AAA AGC CTG GTG CCT GTC CTG GAC ACT GCC GGA ATC AGA TTG ACA GCA 3185
Lys Ser Leu Val Pro Val Leu Asp Thr Ala Gly Ile Arg Leu Thr Ala
1020 1025 1030

GAG GAG TGG AGC ACC ATA ATT ACA GCA TTT AAG GAG GAC AGA GCT TAC 3233
Glu Glu Trp Ser Thr Ile Ile Thr Ala Phe Lys Glu Asp Arg Ala Tyr
1035 1040 1045

TCT CCA GTG GTG GCC TTG AAT GAA ATT TGC ACC AAG TAC TAT GGA GTT 3281
Ser Pro Val Val Ala Leu Asn Glu Ile Cys Thr Lys Tyr Tyr Gly Val
1050 1055 1060 1065

GAC CTG GAC AGT GGC CTG TTT TCT GCC CCG AAG GTG TCC CTG TAT TAC 3329
Asp Leu Asp Ser Gly Leu Phe Ser Ala Pro Lys Val Ser Leu Tyr Tyr
1070 1075 1080

GAG AAC AAC CAC TGG GAT AAC AGA CCT GGT GGA AGG ATG TAT GGA TTC 3377
Glu Asn Asn His Trp Asp Asn Arg Pro Gly Gly Arg Met Tyr Gly Phe
1085 1090 1095

AAT GCC GCA ACA GCT GCC AGG CTG GAA GCT AGA CAT ACC TTC CTG AAG 3425
Asn Ala Ala Thr Ala Ala Arg Leu Glu Ala Arg His Thr Phe Leu Lys
1100 1105 1110

GGG CAG TGG CAT ACG GGC AAG CAG GCA GTT ATC GCA GAA AGA AAA ATC 3473
Gly Gln Trp His Thr Gly Lys Gln Ala Val Ile Ala Glu Arg Lys Ile
1115 1120 1125

CAA CCG CTT TCT GTG CTG GAC AAT GTA ATT CCT ATC AAC CGC AGG CTG 3521
Gln Pro Leu Ser Val Leu Asp Asn Val Ile Pro Ile Asn Arg Arg Leu
1130 1135 1140 1145

CCG CAC GCC CTG GTG GCT GAG TAC AAG ACG GTT AAA GGC AGT AGG GTT 3569
Pro His Ala Leu Val Ala Glu Tyr Lys Thr Val Lys Gly Ser Arg Val
1150 1155 1160

GAG TGG CTG GTC AAT AAA GTA AGA GGG TAC CAC GTC CTG CTG GTG AGT 3617
Glu Trp Leu Val Asn Lys Val Arg Gly Tyr His Val Leu Leu Val Ser
1165 1170 1175

GAG TAC AAC CTG GCT TTG CCT CGA CGC AGG GTC ACT TGG TTG TCA CCG 3665
Glu Tyr Asn Leu Ala Leu Pro Arg Arg Arg Val Thr Trp Leu Ser Pro
1180 1185 1190

(Sg AAT GTC ACA GGC GCC GAT AGG TGC TAC GAC CTA AGT TTA GGA CTG 3713
Ifiu Asn Val Thr Gly Ala Asp Arg Cys Tyr Asp Leu Ser Leu Gly Leu
1195 1200 1205

CCG GCT GAC GCC GGC AGG TTC GAC TTG GTC TTT GTG AAC ATT CAC ACG 3761
Pro Ala Asp Ala Gly Arg Phe Asp Leu Val Phe Val Asn Ile His Thr
1210 1215 1220 1225

GAA TTC AGA ATC CAC CAC TAC CAG CAG TGT GTC GAC CAC GCC ATG AAG 3809
Glu Phe Arg Ile His His Tyr Gln Gln Cys Val Asp His Ala Met Lys
1230 1235 1240

(S"G CAG ATG CTT GGG GGA GAT GCG CTA CGA CTG CTA AAA CCC GGC GGC 3857
Leu Gln Met Leu Gly Gly Asp Ala Leu Arg Leu Leu Lys Pro Gly Gly
1245 1250 1255

ATC TTG ATG AGA GCT TAC GGA TAC GCC GAT AAA ATC AGC GAA GCC GTT 3905
Ile Leu Met Arg Ala Tyr Gly Tyr Ala Asp Lys Ile Ser Glu Ala Val
1260 1265 1270

GTT TCC TCC TTA AGC AGA AAG TTC TCG TCT GCA AGA GTG TTG CGC CCG 3953
Val Ser Ser Leu Ser Arg Lys Phe Ser Ser Ala Arg Val Leu Arg Pro
1275 1280 1285

GAT TGT GTC ACC AGC AAT ACA GAA GTG TTC TTG CTG TTC TCC AAC TTT 4001
Asp Cys Val Thr Ser Asn Thr Glu Val Phe Leu Leu Phe Ser Asn Phe
1290 1295 1300 1305

GAC AAC GGA AAG AGA CCC TCT ACG CTA CAC CAG ATG AAT ACC AAG CTG 4049
Asp Asn Gly Lys Arg Pro Ser Thr Leu His Gln Met Asn Thr Lys Leu
1310 1315 1320

AGT GCC GTG TAT GCC GGA GAA GCC ATG CAC ACG GCC GGG TGT GCA CCA 4097
Ser Ala Val Tyr Ala Gly Glu Ala Met His Thr Ala Gly Cys Ala Pro
1325 1330 1335

TCC TAC AGA GTT AAG AGA GCA GAC ATA GCC ACG TGC ACA GAA GCG GCT 4145
Ser Tyr Arg Val Lys Arg Ala Asp Ile Ala Thr Cys Thr Glu Ala Ala
1340 1345 1350

GTG GTT AAC GCA GCT AAC GCC CGT GGA ACT GTA GGG GAT GGC GTA TGC 4193
Val Val Asn Ala Ala Asn Ala Arg Gly Thr Val Gly Asp Gly Val Cys
1355 1360 1365

AGG GCC GTG GCG AAG AAA TGG CCG TCA GCC TTT AAG GGA GCA GCA ACA 4241
Arg Ala Val Ala Lys Lys Trp Pro Ser Ala Phe Lys Gly Ala Ala Thr
1370 1375 1380 1385

CCA GTG GGC ACA ATT AAA ACA GTC ATG TGC GGC TCG TAC CCC GTC ATC 4289
Pro Val Gly Thr Ile Lys Thr Val Met Cys Gly Ser Tyr Pro Val Ile
1390 1395 1400

CAC GCT GTA GCG CCT AAT TTC TCT GCC ACG ACT GAA GCG GAA GGG GAC 4337
His Ala Val Ala Pro Asn Phe Ser Ala Thr Thr Glu Ala Glu Gly Asp
1405 1410 1415

(JqC GAA TTG GCC GCT GTC TAC CGG GCA GTG GCC GCC GAA GTA AAC AGA 4385
... Glu Leu Ala Ala Val Tyr Arg Ala Val Ala Ala Glu Val Asn Arg
1420 1425 1430

G?G TCA CTG AGC AGC GTA GCC ATC CCG CTG CTG TCC ACA GGA GTG TTC 4433
JMu Ser Leu Ser Ser Val Ala Ile Pro Leu Leu Ser Thr Gly Val Phe
1435 1440 1445

AGC GGC GGA AGA GAT AGG CTG CAG CAA TCC CTC AAC CAT CTA TTC ACA 4481
Ser Gly Gly Arg Asp Arg Leu Gln Gln Ser Leu Asn His Leu Phe Thr
1450 1455 1460 1465

GCA ATG GAC GCC ACG GAC GCT GAC GTG ACC ATC TAC TGC AGA GAC AAA 4529
Ala Met Asp Ala Thr Asp Ala Asp Val Thr Ile Tyr Cys Arg Asp Lys
1470 1475 1480

AGT TGG GAG AAG AAA ATC CAG GAA GCC ATT GAC ATG AGG ACG GCT GTG 4577
Ser Trp Glu Lys Lys Ile Gln Glu Ala Ile Asp Met Arg Thr Ala Val
1485 1490 1495

GAG TTG CTC AAT GAT GAC GTG GAG CTG ACC ACA GAC TTG GTG AGA GTG 4625
Glu Leu Leu Asn Asp Asp Val Glu Leu Thr Thr Asp Leu Val Arg Val
1500 1505 1510

CAC CCG GAC AGC AGC CTG GTG GGT CGT AAG GGC TAC AGT ACC ACT GAC 4673
His Pro Asp Ser Ser Leu Val Gly Arg Lys Gly Tyr Ser Thr Thr Asp
1515 1520 1525

GGG TCG CTG TAC TCG TAC TTT GAA GGT ACG AAA TTC AAC CAG GCT GCT 4721
Gly Ser Leu Tyr Ser Tyr Phe Glu Gly Thr Lys Phe Asn Gln Ala Ala
1530 1535 1540 1545

ATT GAT ATG GCA GAG ATA CTG ACG TTG TGG CCC AGA CTG CAA GAG GCA 4769
Ile Asp Met Ala Glu Ile Leu Thr Leu Trp Pro Arg Leu Gln Glu Ala
1550 1555 1560

AAC GAA CAG ATA TGC CTA TAC GCG CTG GGC GAA ACA ATG GAC AAC ATC 4817
Asn Glu Gln Ile Cys Leu Tyr Ala Leu Gly Glu Thr Met Asp Asn Ile
1565 1570 1575

AGA TCC AAA TGT CCG GTG AAC GAT TCC GAT TCA TCA ACA CCT CCC AGG 4865
Arg Ser Lys Cys Pro Val Asn Asp Ser Asp Ser Ser Thr Pro Pro Arg
1580 1585 1590

ACA GTG CCC TGC CTG TGC CGC TAC GCA ATG ACA GCA GAA CGG ATC GCC 4913
Thr Val Pro Cys Leu Cys Arg Tyr Ala Met Thr Ala Glu Arg Ile Ala
1595 1600 1605

CGC CTT AGG TCA CAC CAA GTT AAA AGC ATG GTG GTT TGC TCA TCT TTT 4961
Arg Leu Arg Ser His Gln Val Lys Ser Met Val Val Cys Ser Ser Phe
1610 1615 1620 1625

CCC CTC CCG AAA TAC CAT GTA GAT GGG GTG CAG AAG GTA AAG TGC GAG 5009
Pro Leu Pro Lys Tyr His Val Asp Gly Val Gln Lys Val Lys Cys Glu
1630 1635 1640

AAG GTT CTC CTG TTC GAC CCG ACG GTA CCT TCA GTG GTT AGT CCG CGG 5057
Lys Val Leu Leu Phe Asp Pro Thr Val Pro Ser Val Val Ser Pro Arg
1645 1650 1655

AAG TAT GCC GCA TCT ACG ACG GAC CAC TCA GAT CGG TCG TTA CGA GGG 5105
Lys Tyr Ala Ala Ser Thr Thr Asp His Ser Asp Arg Ser Leu Arg Gly
1660 1665 1670

TTT GAC TTG GAC TGG ACC ACC GAC TCG TCT TCC ACT GCC AGC GAT ACC 5153
Phe Asp Leu Asp Trp Thr Thr Asp Ser Ser Ser Thr Ala Ser Asp Thr
1675 1680 1685

ATG TCG CTA CCC AGT TTG CAG TCG TGT GAC ATC GAC TCG ATC TAC GAG 5201
Met Ser Leu Pro Ser Leu Gln Ser Cys Asp Ile Asp Ser Ile Tyr Glu
1690 1695 1700 1705

CCA ATG GCT CCC ATA GTA GTG ACG GCT GAC GTA CAC CCT GAA CCC GCA 5249
Pro Met Ala Pro Ile Val Val Thr Ala Asp Val His Pro Glu Pro Ala
1710 1715 1720

GGC ATC GCG GAC CTG GCG GCA GAT GTG CAC CCT GAA CCC GCA GAC CAT 5297
Gly Ile Ala Asp Leu Ala Ala Asp Val His Pro Glu Pro Ala Asp His
1725 1730 1735

GTG GAC CTC GAG AAC CCG ATT CCT CCA CCG CGC CCG AAG AGA GCT GCA 5345
Val Asp Leu Glu Asn Pro Ile Pro Pro Pro Arg Pro Lys Arg Ala Ala
1740 1745 1750

TAC CTT GCC TCC CGC GCG GCG GAG CGA CCG GTG CCG GCG CCG AGA AAG 5393
Tyr Leu Ala Ser Arg Ala Ala Glu Arg Pro Val Pro Ala Pro Arg Lys
1755 1760 1765

CCG ACG CCT GCC CCA AGG ACT GCG TTT AGG AAC AAG CTG CCT TTG ACG 5441
Pro Thr Pro Ala Pro Arg Thr Ala Phe Arg Asn Lys Leu Pro Leu Thr
1770 1775 1780 1785

TTC GGC GAC TTT GAC GAG CAC GAG GTC GAT GCG TTG GCC TCC GGG ATT 5489
Phe Gly Asp Phe Asp Glu His Glu Val Asp Ala Leu Ala Ser Gly Ile
1790 1795 1800

ACT TTC GGA GAC TTC GAC GAC GTC CTG CGA CTA GGC CGC GCG GGT GCA 5537
Thr Phe Gly Asp Phe Asp Asp Val Leu Arg Leu Gly Arg Ala Gly Ala
1805 1810 1815

^T ATT TTC TCC TCG GAC ACT GGC AGC GGA CAT TTA CAA CAA AAA TCC 5585
rr Ile Phe Ser Ser Asp Thr Gly Ser Gly His Leu Gln Gln Lys Ser
1820 1825 1830

GTT AGG CAG CAC AAT CTC CAG TGC GCA CAA CTG GAT GCG GTC CAG GAG 5633
Val Arg Gln His Asn Leu Gln Cys Ala Gln Leu Asp Ala Val Gln Glu
1835 1840 1845

GAG AAA ATG TAC CCG CCA AAA TTG GAT ACT GAG AGG GAG AAG CTG TTG 5681
Glu Lys Met Tyr Pro Pro Lys Leu Asp Thr Glu Arg Glu Lys Leu Leu
1850 1855 1860 1865

OTG CTG AAA ATG CAG ATG CAC CCA TCG GAG GCT AAT AAG AGT CGA TAC 5729
Leu Leu Lys Met Gln Met His Pro Ser Glu Ala Asn Lys Ser Arg Tyr
1870 1875 1880

CAG TCT CGC AAA GTG GAG AAC ATG AAA GCC ACG GTG GTG GAC AGG CTC 5777
Gln Ser Arg Lys Val Glu Asn Met Lys Ala Thr Val Val Asp Arg Leu
1885 1890 1895

ACA TCG GGG GCC AGA TTG TAC ACG GGA GCG GAC GTA GGC CGC ATA CCA 5825
Thr Ser Gly Ala Arg Leu Tyr Thr Gly Ala Asp Val Gly Arg Ile Pro
1900 1905 1910

ACA TAC GCG GTT CGG TAC CCC CGC CCC GTG TAC TCC CCT ACC GTG ATC 5873
Thr Tyr Ala Val Arg Tyr Pro Arg Pro Val Tyr Ser Pro Thr Val Ile
1915 1920 1925

GAA AGA TTC TCA AGC CCC GAT GTA GCA ATC GCA GCG TGC AAC GAA TAC 5921
Glu Arg Phe Ser Ser Pro Asp Val Ala Ile Ala Ala Cys Asn Glu Tyr
1930 1935 1940 1945

CTA TCC AGA AAT TAC CCA ACA GTG GCG TCG TAC CAG ATA ACA GAT GAA 5969
Leu Ser Arg Asn Tyr Pro Thr Val Ala Ser Tyr Gln Ile Thr Asp Glu
1950 1955 1960

TAC GAC GCA TAC TTG GAC ATG GTT GAC GGG TCG GAT AGT TGC TTG GAC 6017
Tyr Asp Ala Tyr Leu Asp Met Val Asp Gly Ser Asp Ser Cys Leu Asp
1965 1970 1975

AGA GCG ACA TTC TGC CCG GCG AAG CTC CGG TGC TAC CCG AAA CAT CAT 6065
Arg Ala Thr Phe Cys Pro Ala Lys Leu Arg Cys Tyr Pro Lys His His
1980 1985 1990

GCG TAC CAC CAG CCG ACT GTA CGC AGT GCC GTC CCG TCA CCC TTT CAG 6113
Ala Tyr His Gln Pro Thr Val Arg Ser Ala Val Pro Ser Pro Phe Gln
1995 2000 2005

AAC ACA CTA CAG AAC GTG CTA GCG GCC GCC ACC AAG AGA AAC TGC AAC 6161
Asn Thr Leu Gln Asn Val Leu Ala Ala Ala Thr Lys Arg Asn Cys Asn
2010 2015 2020 2025

fire ACG CAA ATG CGA GAA CTA CCC ACC ATG GAC TCG GCA GTG TTC AAC 6209
... Thr Gln Met Arg Glu Leu Pro Thr Met Asp Ser Ala Val Phe Asn
2030 2035 2040

... GAG TGC TTC AAG CGC TAT GCC TGC TCC GGA GAA TAT TGG GAA GAA 6257
... Glu Cys Phe Lys Arg Tyr Ala Cys Ser Gly Glu Tyr Trp Glu Glu
2045 2050 2055

TAT GCT AAA CAA CCT ATC CGG ATA ACC ACT GAG AAC ATC ACT ACC TAT 6305
Tyr Ala Lys Gln Pro Ile Arg Ile Thr Thr Glu Asn Ile Thr Thr Tyr
2060 2065 2070

GTG ACC AAA TTG AAA GGC CCG AAA GCT GCT GCC TTG TTC GCT AAG ACC 6353
Val Thr Lys Leu Lys Gly Pro Lys Ala Ala Ala Leu Phe Ala Lys Thr
2075 2080 2085

CAC AAC TTG GTT CCG CTG CAG GAG GTT CCC ATG GAC AGA TTC ACG GTC 6401
His Asn Leu Val Pro Leu Gln Glu Val Pro Met Asp Arg Phe Thr Val
2090 2095 2100 2105

GAC ATG AAA CGA GAT GTC AAA GTC ACT CCA GGG ACG AAA CAC ACA GAG 6449
Asp Met Lys Arg Asp Val Lys Val Thr Pro Gly Thr Lys His Thr Glu
2110 2115 2120

GAA AGA CCC AAA GTC CAG GTA ATT CAA GCA GCG GAG CCA TTG GCG ACC 6497
Glu Arg Pro Lys Val Gln Val Ile Gln Ala Ala Glu Pro Leu Ala Thr
2125 2130 2135

GCT TAC CTG TGC GGC ATC CAC AGG GAA TTA GTA AGG AGA CTA AAT GCT 6545
Ala Tyr Leu Cys Gly Ile His Arg Glu Leu Val Arg Arg Leu Asn Ala
2140 2145 2150

GTG TTA CGC CCT AAC GTG CAC ACA TTG TTT GAT ATG TCG GCC GAA GAC 6593
Val Leu Arg Pro Asn Val His Thr Leu Phe Asp Met Ser Ala Glu Asp
2155 2160 2165

TTT GAC GCG ATC ATC GCC TCT CAC TTC CAC CCA GGA GAC CCG GTT CTA 6641
Phe Asp Ala Ile Ile Ala Ser His Phe His Pro Gly Asp Pro Val Leu
2170 2175 2180 2185

GAG ACG GAC ATT GCA TCA TTC GAC AAA AGC CAG GAC GAC TCC TTG GCT 6689
Glu Thr Asp Ile Ala Ser Phe Asp Lys Ser Gln Asp Asp Ser Leu Ala
2190 2195 2200

CTT ACA GGT TTA ATG ATC CTC GAA GAT CTA GGG GTG GAT CAG TAC CTG 6737
Leu Thr Gly Leu Met Ile Leu Glu Asp Leu Gly Val Asp Gln Tyr Leu
2205 2210 2215

CTG GAC TTG ATC GAG GCA GCC TTT GGG GAA ATA TCC AGC TGT CAC CTA 6785
Leu Asp Leu Ile Glu Ala Ala Phe Gly Glu Ile Ser Ser Cys His Leu
2220 2225 2230

CCA ACT GGC ACG CGC TTC AAG TTC GGA GCT ATG ATG AAA TCG GGC ATG 6833
Pro Thr Gly Thr Arg Phe Lys Phe Gly Ala Met Met Lys Ser Gly Met
2235 2240 2245

TTT CTG ACT TTG TTT ATT AAC ACT GTT TTG AAC ATC ACC ATA GCA AGC 6881
Phe Leu Thr Leu Phe Ile Asn Thr Val Leu Asn Ile Thr Ile Ala Ser
2250 2255 2260 2265

^3G GTA CTG GAG CAG AGA CTC ACT GAC TCC GCC TGT GCG GCC TTC ATC 6929
Arg Val Leu Glu Gln Arg Leu Thr Asp Ser Ala Cys Ala Ala Phe Ile
2270 2275 2280

GGC GAC GAC AAC ATC GTT CAC GGA GTG ATC TCC GAC AAG CTG ATG GCG 6977
Gly Asp Asp Asn Ile Val His Gly Val Ile Ser Asp Lys Leu Met Ala
2285 2290 2295

GAG AGG TGC GCG TCG TGG GTC AAC ATG GAG GTG AAG ATC ATT GAC GCT 7025
Glu Arg Cys Ala Ser Trp Val Asn Met Glu Val Lys Ile Ile Asp Ala
2300 2305 2310

GTC ATG GGC GAA AAA CCC CCA TAT TTT TGT GGG GGA TTC ATA GTT TTT 7073
Val Met Gly Glu Lys Pro Pro Tyr Phe Cys Gly Gly Phe Ile Val Phe
2315 2320 2325

GAC AGC GTC ACA CAG ACC GCC TGC CGT GTT TCA GAC CCA CTT AAG CGC 7121
Asp Ser Val Thr Gln Thr Ala Cys Arg Val Ser Asp Pro Leu Lys Arg
2330 2335 2340 2345

CTG TTC AAG TTG GGT AAG CCG CTA ACA GCT GAA GAC AAG CAG GAC GAA 7169
Leu Phe Lys Leu Gly Lys Pro Leu Thr Ala Glu Asp Lys Gln Asp Glu
2350 2355 2360

GAC AGG CGA CGA GCA CTG AGT GAC GAG GTT AGC AAG TGG TTC CGG ACA 7217
Asp Arg Arg Arg Ala Leu Ser Asp Glu Val Ser Lys Trp Phe Arg Thr
2365 2370 2375

GGC TTG GGG GCC GAA CTG GAG GTG GCA CTA ACA TCT AGG TAT GAG GTA 7265
Gly Leu Gly Ala Glu Leu Glu Val Ala Leu Thr Ser Arg Tyr Glu Val
2380 2385 2390

GAG GGC TGC AAA AGT ATC CTC ATA GCC ATG ACC ACC TTG GCG AGG GAC 7313
Glu Gly Cys Lys Ser Ile Leu Ile Ala Met Thr Thr Leu Ala Arg Asp
2395 2400 2405

ATT AAG GCG TTT AAG AAA TTG AGA GGA CCT GTT ATA CAC CTC TAC GGC 7361
Ile Lys Ala Phe Lys Lys Leu Arg Gly Pro Val Ile His Leu Tyr Gly
2410 2415 2420 2425

GGT CCT AGA TTG GTG CGT TAATACACAG AATTCTGATT ATAGCGCACT 7409
Gly Pro Arg Leu Val Arg
2430

ATTATAGCAC C ATG AAT TAC ATC CCT ACG CAA ACG TTT TAC GGC CGC CGG 7459
Met Asn Tyr Ile Pro Thr Gln Thr Phe Tyr Gly Arg Arg
1 5 10

TGG CGC CCG CGC CCG GCG GCC CGT CCT TGG CCG TTG CAG GCC ACT CCG 7507
Trp Arg Pro Arg Pro Ala Ala Arg Pro Trp Pro Leu Gln Ala Thr Pro
15 20 25

GTG GCT CCC GTC GTC CCC GAC TTC CAG GCC CAG CAG ATG CAG CAA CTC 7555
Val Ala Pro Val Val Pro Asp Phe Gln Ala Gln Gln Met Gln Gln Leu
30 35 40 45

ATC AGC GCC GTA AAT GCG CTG ACA ATG AGA CAG AAC GCA ATT GCT CCT 7603
Ile Ser Ala Val Asn Ala Leu Thr Met Arg Gln Asn Ala Ile Ala Pro
50 55 60

GCT AGG CCT CCC AAA CCA AAG AAG AAG AAG ACA ACC AAA CCA AAG CCG 7651
Ala Arg Pro Pro Lys Pro Lys Lys Lys Lys Thr Thr Lys Pro Lys Pro
65 70 75

AAA ACG CAG CCC AAG AAG ATC AAC GGA AAA ACG CAG CAG CAA AAG AAG 7699
Lys Thr Gln Pro Lys Lys Ile Asn Gly Lys Thr Gln Gln Gln Lys Lys
80 85 90

AAA GAC AAG CAA GCC GAC AAG AAG AAG AAG AAA CCC GGA AAA AGA GAA 7747
Lys Asp Lys Gln Ala Asp Lys Lys Lys Lys Lys Pro Gly Lys Arg Glu
95 100 105

AGA ATG TGC ATG AAG ATT GAA AAT GAC TGT ATC TTC GAA GTC AAA CAC 7795
Arg Met Cys Met Lys Ile Glu Asn Asp Cys Ile Phe Glu Val Lys His
110 115 120 125

GAA GGA AAG GTC ACT GGG TAC GCC TGC CTG GTG GGC GAC AAA GTC ATG 7843
Glu Gly Lys Val Thr Gly Tyr Ala Cys Leu Val Gly Asp Lys Val Met
130 135 140

AAA CCT GCC CAC GTG AAA GGA GTC ATC GAC AAC GCG GAC CTG GCA AAG 7891
Lys Pro Ala His Val Lys Gly Val Ile Asp Asn Ala Asp Leu Ala Lys
145 150 155

CTA GCT TTC AAG AAA TCG AGC AAG TAT GAC CTT GAG TGT GCC CAG ATA 7939
Leu Ala Phe Lys Lys Ser Ser Lys Tyr Asp Leu Glu Cys Ala Gln Ile
160 165 170

CCA GTT CAC ATG AGG TCG GAT GCC TCA AAG TAC ACG CAT GAG AAG CCC 7987
Pro Val His Met Arg Ser Asp Ala Ser Lys Tyr Thr His Glu Lys Pro
175 180 185

GAG GGA CAC TAT AAC TGG CAC CAC GGG GCT GTT CAG TAC AGC GGA GGT 8035
Glu Gly His Tyr Asn Trp His His Gly Ala Val Gln Tyr Ser Gly Gly
190 195 200 205

&§G TTC ACT ATA CCG ACA GGA GCG GGC AAA CCG GGA GAC AGT GGC CGG 8083
Arg Phe Thr Ile Pro Thr Gly Ala Gly Lys Pro Gly Asp Ser Gly Arg
210 215 220

CCC ATC TTT GAC AAC AAG GGG AGG GTA GTC GCT ATC GTC CTG GGC GGG 8131
Pro Ile Phe Asp Asn Lys Gly Arg Val Val Ala Ile Val Leu Gly Gly
225 230 235

GCC AAC GAG GGC TCA CGC ACA GCA CTG TCG GTG GTC ACC TGG AAC AAA 8179
Ala Asn Glu Gly Ser Arg Thr Ala Leu Ser Val Val Thr Trp Asn Lys
240 245 250

GAT ATG GTG ACT AGA GTG ACC CCC GAG GGG TCC GAA GAG TGG TCC GCC 8227
Asp Met Val Thr Arg Val Thr Pro Glu Gly Ser Glu Glu Trp Ser Ala
255 260 265

CCG CTG ATT ACT GCC ATG TGT GTC CTT GCC AAT GCT ACC TTC CCG TGC 8275
Pro Leu Ile Thr Ala Met Cys Val Leu Ala Asn Ala Thr Phe Pro Cys
270 275 280 285

TTC CAG CCC CCG TGT GTA CCT TGC TGC TAT GAA AAC AAC GCA GAG GCC 8323
Phe Gln Pro Pro Cys Val Pro Cys Cys Tyr Glu Asn Asn Ala Glu Ala
290 295 300

ACA CTA CGG ATG CTC GAG GAT AAC GTG GAT AGG CCA GGG TAC TAC GAC 8371
Thr Leu Arg Met Leu Glu Asp Asn Val Asp Arg Pro Gly Tyr Tyr Asp
305 310 315

CTC CTT CAG GCA GCC TTG ACG TGC CGA AAC GGA ACA AGA CAC CGG CGC 8419
Leu Leu Gln Ala Ala Leu Thr Cys Arg Asn Gly Thr Arg His Arg Arg
320 325 330

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AGC GTG TCG CAA CAC TTC AAC GTG TAT AAG GCT ACA CGC CCT TAC ATC 8467 
Ser Val Ser Gln His Phe Asn Val Tyr Lys Ala Thr Arg Pro Tyr Ile
335 340 345

GCG TAC TGC GCC GAC TGC GGA GCA GGG CAC TCG TGT CAT AGC CCC GTA 8515
Ala Tyr Cys Ala Asp Cys Gly Ala Gly His Ser Cys His Ser Pro Val
350 355 360 365

GCA ATT GAA GCG GTC AGG TCC GAA GCT ACC GAC GGG ATG CTG AAG ATT 8563
Ala Ile Glu Ala Val Arg Ser Glu Ala Thr Asp Gly Met Leu Lys Ile
370 375 380

CAG TTC TCG GCA CAA ATT GGC ATA GAT AAG AGT GAC AAT CAT GAC TAC 8611
Gln Phe Ser Ala Gln Ile Gly Ile Asp Lys Ser Asp Asn His Asp Tyr
385 390 395

ACG AAG ATA AGG TAC GCA GAC GGG CAC GCC ATT GAG AAT GCC GTC CGG 8659
Thr Lys Ile Arg Tyr Ala Asp Gly His Ala Ile Glu Asn Ala Val Arg
400 405 410

TCA TCT TTG AAG GTA GCC ACC TCC GGA GAC TGT TTC GTC CAT GGC ACA 8707
Ser Ser Leu Lys Val Ala Thr Ser Gly Asp Cys Phe Val His Gly Thr
415 420 425

ATG GGA CAT TTC ATA CTG GCA AAG TGC CCA CCG GGT GAA TTC CTG CAG 8755
Met Gly His Phe Ile Leu Ala Lys Cys Pro Pro Gly Glu Phe Leu Gln
430 435 440 445

GTC TCG ATC CAG GAC ACC AGA AAC GCG GTC CGT GCC TGC AGA ATA CAA 8803
Val Ser Ile Gln Asp Thr Arg Asn Ala Val Arg Ala Cys Arg Ile Gln
450 455 460

TAT CAT CAT GAC CCT CAA CCG GTG GGT AGA GAA AAA TTT ACA ATT AGA 8851
Tyr His His Asp Pro Gln Pro Val Gly Arg Glu Lys Phe Thr Ile Arg
465 470 475 

CCA CAC TAT GGA AAA GAG ATC CCT TGC ACC ACT TAT CAA CAG ACC ACA 8899 
Pro His Tyr Gly Lys Glu Ile Pro Cys Thr Thr Tyr Gln Gln Thr Thr
480 485 490

GCG AAG ACC GTG GAG GAA ATC GAC ATG CAT ATG CCG CCA GAT ACG CCG 8947
Ala Lys Thr Val Glu Glu Ile Asp Met His Met Pro Pro Asp Thr Pro
495 500 505

GAC AGG ACG TTG CTA TCA CAG CAA TCT GGC AAT GTA AAG ATC ACA GTC 8995
Asp Arg Thr Leu Leu Ser Gln Gln Ser Gly Asn Val Lys Ile Thr Val
510 515 520 525 

GGA GGA AAG AAG GTG AAA TAC AAC TGC ACC TGT GGA ACC GGA AAC GTT 9043 
Gly Gly Lys Lys Val Lys Tyr Asn Cys Thr Cys Gly Thr Gly Asn Val 

530 535 540

GGC ACT ACT AAT TCG GAC ATG ACG ATC AAC ACG TGT CTA ATA GAG CAG 9091
Gly Thr Thr Asn Ser Asp Met Thr Ile Asn Thr Cys Leu Ile Glu Gln
545 550 555

TGC CAC GTC TCA GTG ACG GAC CAT AAG AAA TGG CAG TTC AAC TCA CCT 9139
Cys His Val Ser Val Thr Asp His Lys Lys Trp Gln Phe Asn Ser Pro
560 565 570

TTC GTC CCG AGA GCC GAC GAA CCG GCT AGA AAA GGC AAA GTC CAT ATC 9187
Phe Val Pro Arg Ala Asp Glu Pro Ala Arg Lys Gly Lys Val His Ile
575 580 585

CCA TTC CCG TTG GAC AAC ATC ACA TGC AGA GTT CCA ATG GCG CGC GAA 9235
Pro Phe Pro Leu Asp Asn Ile Thr Cys Arg Val Pro Met Ala Arg Glu
590 595 600 605

CCA ACC GTC ATC CAC GGC AAA AGA GAA GTG ACA CTG CAC CTT CAC CCA 9283
Pro Thr Val Ile His Gly Lys Arg Glu Val Thr Leu His Leu His Pro
610 615 620

GAT CAT CCC ACG CTC TTT TCC TAC CGC ACA CTG GGT GAG GAC CCG CAG 9331
Asp His Pro Thr Leu Phe Ser Tyr Arg Thr Leu Gly Glu Asp Pro Gln
625 630 635

TAT CAC GAG GAA TGG GTG ACA GCG GCG GTG GAA CGG ACC ATA CCC GTA 9379
Tyr His Glu Glu Trp Val Thr Ala Ala Val Glu Arg Thr Ile Pro Val
640 645 650

CCA GTG GAC GGG ATG GAG TAC CAC TGG GGA AAC AAC GAC CCA GTG AGG 9427
Pro Val Asp Gly Met Glu Tyr His Trp Gly Asn Asn Asp Pro Val Arg
655 660 665

(ft TGG TCT CAA CTC ACC ACT GAA GGG AAA CCG CAC GGC TGG CCG CAT 9475 
Leu Trp Ser Gln Leu Thr Thr Glu Gly Lys Pro His Gly Trp Pro His
670 675 680 685

CAG ATC GTA CAG TAC TAC TAT GGG CTT TAC CCG GCC GCT ACA GTA TCC 9523
Gln Ile Val Gln Tyr Tyr Tyr Gly Leu Tyr Pro Ala Ala Thr Val Ser
690 695 700

GCG GTC GTC GGG ATG AGC TTA CTG GCG TTG ATA TCG ATC TTC GCG TCG 9571
Ala Val Val Gly Met Ser Leu Leu Ala Leu Ile Ser Ile Phe Ala Ser
705 710 715

TGC TAC ATG CTG GTT GCG GCC CGC AGT AAG TGC TTG ACC CCT TAT GCT 9619
Cys Tyr Met Leu Val Ala Ala Arg Ser Lys Cys Leu Thr Pro Tyr Ala
720 725 730

TTA ACA CCA GGA GCT GCA GTT CCG TGG ACG CTG GGG ATA CTC TGC TGC 9667
Leu Thr Pro Gly Ala Ala Val Pro Trp Thr Leu Gly Ile Leu Cys Cys
735 740 745

GCC CCG CGG GCG CAC GCA GCT AGT GTG GCA GAG ACT ATG GCC TAC TTG 9715
Ala Pro Arg Ala His Ala Ala Ser Val Ala Glu Thr Met Ala Tyr Leu 
750 755 760 765 

TGG GAC CAA AAC CAA GCG TTG TTC TGG TTG GAG TTT GCG GCC CCT GTT 9763 
Trp Asp Gln Asn Gln Ala Leu Phe Trp Leu Glu Phe Ala Ala Pro Val
770 775 780

GCC TGC ATC CTC ATC ATC ACG TAT TGC CTC AGA AAC GTG CTG TGT TGC 9811
Ala Cys Ile Leu Ile Ile Thr Tyr Cys Leu Arg Asn Val Leu Cys Cys
785 790 795

TGT AAG AGC CTT TCT TTT TTA GTG CTA CTG AGC CTC GGG GCA ACC GCC 9859
Cys Lys Ser Leu Ser Phe Leu Val Leu Leu Ser Leu Gly Ala Thr Ala
800 805 810

AGA GCT TAC GAA CAT TCG ACA GTA ATG CCG AAC GTG GTG GGG TTC CCG 9907
Arg Ala Tyr Glu His Ser Thr Val Met Pro Asn Val Val Gly Phe Pro
815 820 825

TAT AAG GCT CAC ATT GAA AGG CCA GGA TAT AGC CCC CTC ACT TTG CAG 9955
Tyr Lys Ala His Ile Glu Arg Pro Gly Tyr Ser Pro Leu Thr Leu Gln
830 835 840 845

ATG CAG GTT GTT GAA ACC AGC CTC GAA CCA ACC CTT AAT TTG GAA TAC 10003
Met Gln Val Val Glu Thr Ser Leu Glu Pro Thr Leu Asn Leu Glu Tyr
850 855 860

ATA ACC TGT GAG TAC AAG ACG GTC GTC CCG TCG CCG TAC GTG AAG TGC 10051
Ile Thr Cys Glu Tyr Lys Thr Val Val Pro Ser Pro Tyr Val Lys Cys
865 870 875

TGC GGC GCC TCA GAG TGC TCC ACT AAA GAG AAG CCT GAC TAC CAA TGC 10099
Cys Gly Ala Ser Glu Cys Ser Thr Lys Glu Lys Pro Asp Tyr Gln Cys
880 885 890

AAG GTT TAC ACA GGC GTG TAC CCG TTC ATG TGG GGA GGG GCA TAT TGC 10147
Lys Val Tyr Thr Gly Val Tyr Pro Phe Met Trp Gly Gly Ala Tyr Cys
895 900 905

TTC TGC GAC TCA GAA AAC ACG CAA CTC AGC GAG GCG TAC GTC GAT CGA 10195
Phe Cys Asp Ser Glu Asn Thr Gln Leu Ser Glu Ala Tyr Val Asp Arg
910 915 920 925

TCG GAC GTA TGC AGG CAT GAT CAC GCA TCT GCT TAC AAA GCC CAT ACA 10243
Ser Asp Val Cys Arg His Asp His Ala Ser Ala Tyr Lys Ala His Thr
930 935 940

GCA TCG CTG AAG GCC AAA GTG AGG GTT ATG TAC GGC AAC GTA AAC CAG 10291
Ala Ser Leu Lys Ala Lys Val Arg Val Met Tyr Gly Asn Val Asn Gln
945 950 955

ACT GTG GAT GTT TAC GTG AAC GGA GAC CAT GCC GTC ACG ATA GGG GGT 10339
Thr Val Asp Val Tyr Val Asn Gly Asp His Ala Val Thr Ile Gly Gly
960 965 970

ACT CAG TTC ATA TTC GGG CCG CTG TCA TCG GCC TGG ACC CCG TTC GAC 10387
Thr Gln Phe Ile Phe Gly Pro Leu Ser Ser Ala Trp Thr Pro Phe Asp
975 980 985

AAC AAG ATA GTC GTG TAC AAA GAC GAA GTG TTC AAT CAG GAC TTC CCG 10435
Asn Lys Ile Val Val Tyr Lys Asp Glu Val Phe Asn Gln Asp Phe Pro
990 995 1000 1005

CCG TAC GGA TCT GGG CAA CCA GGG CGC TTC GGC GAC ATC CAA AGC AGA 10483
Pro Tyr Gly Ser Gly Gln Pro Gly Arg Phe Gly Asp Ile Gln Ser Arg
1010 1015 1020

ACA GTG GAG AGT AAC GAC CTG TAC GCG AAC ACG GCA CTG AAG CTG GCA 10531
Thr Val Glu Ser Asn Asp Leu Tyr Ala Asn Thr Ala Leu Lys Leu Ala
1025 1030 1035

cSb CCT TCA CCC GGC ATG GTC CAT GTA CCG TAC ACA CAG ACA CCT TCA 10579 
Arg Pro Ser Pro Gly Met Val His Val Pro Tyr Thr Gln Thr Pro Ser
1040 1045 1050

Gfeb TTC AAA TAT TGG CTA AAG GAA AAA GGG ACA GCC CTA AAT ACG AAG 10627 
Gly Phe Lys Tyr Trp Leu Lys Glu Lys Gly Thr Ala Leu Asn Thr Lys
1055 1060 1065

G& CCT TTT GGC TGC CAA ATC AAA ACG AAC CCT GTC AGG GCC ATG AAC 10675 

Ala Pro Phe Gly Cys Gln Ile Lys Thr Asn Pro Val Arg Ala Met Asn
1070 1075 1080 1085

TGC GCC GTG GGA AAC ATC CCT GTC TCC ATG AAT TTG CCT GAC AGC GCC 10723
Cys Ala Val Gly Asn Ile Pro Val Ser Met Asn Leu Pro Asp Ser Ala
1090 1095 1100

TTT ACC CGC ATT GTC GAG GCG CCG ACC ATC ATT GAC CTG ACT TGC ACA 10771
Phe Thr Arg Ile Val Glu Ala Pro Thr Ile Ile Asp Leu Thr Cys Thr
1105 1110 1115

GTG GCT ACC TGT ACG CAC TCC TCG GAT TTC GGC GGC GTC TTG ACA CTG 10819 
Val Ala Thr Cys Thr His Ser Ser Asp Phe Gly Gly Val Leu Thr Leu 
1120 1125 1130 

ACG TAC AAG ACC AAC AAG AAC GGG GAC TGC TCT GTA CAC TCG CAC TCT 10867 
Thr Tyr Lys Thr Asn Lys Asn Gly Asp Cys Ser Val His Ser His Ser 
1135 1140 1145 



AAC GTA GCT ACT CTA CAG GAG GCC ACA GCA AAA GTG AAG ACA GCA GGT 10915
Asn Val Ala Thr Leu Gln Glu Ala Thr Ala Lys Val Lys Thr Ala Gly
1150 1155 1160 1165

AAG GTG ACC TTA CAC TTC TCC ACG GCA AGC GCA TCA CCT TCT TTT GTG 10963 
Lys Val Thr Leu His Phe Ser Thr Ala Ser Ala Ser Pro Ser Phe Val
1170 1175 1180

GTG TCG CTA TGC AGT GCT AGG GCC ACC TGT TCA GCG TCG TGT GAG CCC 11011
Val Ser Leu Cys Ser Ala Arg Ala Thr Cys Ser Ala Ser Cys Glu Pro
1185 1190 1195

CCG AAA GAC CAC ATA GTC CCA TAT GCG GCT AGC CAC AGT AAC GTA GTG 11059
Pro Lys Asp His Ile Val Pro Tyr Ala Ala Ser His Ser Asn Val Val
1200 1205 1210

TTT CCA GAC ATG TCG GGC ACC GCA CTA TCA TGG GTG CAG AAA ATC TCG 11107
Phe Pro Asp Met Ser Gly Thr Ala Leu Ser Trp Val Gln Lys Ile Ser
1215 1220 1225

GGT GGT CTG GGG GCC TTC GCA ATC GGC GCT ATC CTG GTG CTG GTT GTG 11155
Gly Gly Leu Gly Ala Phe Ala Ile Gly Ala Ile Leu Val Leu Val Val
1230 1235 1240 1245

GTC ACT TGC ATT GGG CTC CGC AGA TAAGTTAGGG TAGGCAATGG CATTGATATA 11209
Val Thr Cys Ile Gly Leu Arg Arg
1250

QpCAAGAAAAT TGAAAACAGA AAAAGTTAGG GTAAGCAATG GCATATAACC ATAACTGTAT 11269 

AACTTGTAAC AAAGCGCAAC AAGACCTGCG CAATTGGCCC CGTGGTCCGC CTCACGGAAA 11329

bpCGGGGCAA CTCATATTGA CACATTAATT GGCAATAATT GGAAGCTTAC ATAAGCTTAA 11389

JjTCGACGAAT AATTGGATTT TTATTTTATT TTGCAATTGG TTTTTAATAT TTCCAAAAAA 11449

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 11509
AAAACTAG 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2431 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Ala Lys Val His Val Asp Ile Glu Ala Asp Ser Pro Phe Ile
1 5 10 15

Lys Ser Leu Gln Lys Ala Phe Pro Ser Phe Glu Val Glu Ser Leu Gln
20 25 30

Val Thr Pro Asn Asp His Ala Asn Ala Arg Ala Phe Ser His Leu Ala
35 40 45

Thr Lys Leu Ile Glu Gln Glu Thr Asp Lys Asp Thr Leu Ile Leu Asp
50 55 60

Ile Gly Ser Ala Pro Ser Arg Arg Met Met Ser Thr His Lys Tyr His
65 70 75 80

Cys Val Cys Pro Met Arg Ser Ala Glu Asp Pro Glu Arg Leu Asp Ser
85 90 95

Tyr Ala Lys Lys Leu Ala Ala Ala Ser Gly Lys Val Leu Asp Arg Glu
100 105 110

Ile Ala Gly Lys Ile Thr Asp Leu Gln Thr Val Met Ala Thr Pro Asp
115 120 125

AjEk Glu Ser Pro Thr Phe Cys Leu His Thr Asp Val Thr Cys Arg Thr
130 135 140

Ala Ala Glu Val Ala Val Tyr Gln Asp Val Tyr Ala Val His Ala Pro
150 155 160

Thr Ser Leu Tyr His Gln Ala Met Lys Gly Val Arg Thr Ala Tyr Trp
165 170 175

Ile Gly Phe Asp Thr Thr Pro Phe Met Phe Asp Ala Leu Ala Gly Ala
180 185 190

Tpr Pro Thr Tyr Ala Thr Asn Trp Ala Asp Glu Gln Val Leu Gln Ala
195 200 205

Arg Asn Ile Gly Leu Cys Ala Ala Ser Leu Thr Glu Gly Arg Leu Gly
210 215 220

Lys Leu Ser Ile Leu Arg Lys Lys Gln Leu Lys Pro Cys Asp Thr Val
225 230 235 240

Met Phe Ser Val Gly Ser Thr Leu Tyr Thr Glu Ser Arg Lys Leu Leu
245 250 255

Arg Ser Trp His Leu Pro Ser Val Phe His Leu Lys Gly Lys Gln Ser
260 265 270

Phe Thr Cys Arg Cys Asp Thr Ile Val Ser Cys Glu Gly Tyr Val Val
275 280 285

Lys Lys Ile Thr Met Cys Pro Gly Leu Tyr Gly Lys Thr Val Gly Tyr
290 295 300

Ala Val Thr Tyr His Ala Glu Gly Phe Leu Val Cys Lys Thr Thr Asp
305 310 315 320

Thr Val Lys Gly Glu Arg Val Ser Phe Pro Val Cys Thr Tyr Val Pro
325 330 335

Ser Thr Ile Cys Asp Gln Met Thr Gly Ile Leu Ala Thr Asp Val Thr
340 345 350

Pro Glu Asp Ala Gln Lys Leu Leu Val Gly Leu Asn Gln Arg Ile Val
355 360 365

Val Asn Gly Arg Thr Gln Arg Asn Thr Asn Thr Met Lys Asn Tyr Leu
370 375 380

Leu Pro Ile Val Ala Val Ala Phe Ser Lys Trp Ala Arg Glu Tyr Lys
385 390 395 400

Asp Leu Asp Asp Glu Lys Pro Leu Gly Val Arg Glu Arg Ser Leu
405 410 415

Cys Cys Cys Leu Trp Ala Phe Lys Thr Arg Lys Met His Thr Met
420 425 430

m Lys Lys Pro Asp Thr Gln Thr Ile Val Lys Val Pro Ser Glu Phe
435 440 445

Ser Phe Val Ile Pro Ser Leu Trp Ser Thr Gly Leu Ala Ile Pro
450 455 460

Val Arg Ser Arg Ile Lys Met Leu Leu Ala Lys Lys Thr Lys Arg Glu
465 470 475 480

Leu Ile Pro Val Leu Asp Ala Ser Ser Ala Arg Asp Ala Glu Gln Glu
485 490 495

Glu Lys Glu Arg Leu Glu Ala Glu Leu Thr Arg Glu Ala Leu Pro Pro
500 505 510

Leu Val Pro Ile Ala Pro Ala Glu Thr Gly Val Val Asp Val Asp Val
515 520 525

Glu Glu Leu Glu Tyr His Ala Gly Ala Gly Val Val Glu Thr Pro Arg
530 535 540

Ser Ala Leu Lys Val Thr Ala Gln Pro Asn Asp Val Leu Leu Gly Asn
545 550 555 560

Tyr Val Val Leu Ser Pro Gln Thr Val Leu Lys Ser Ser Lys Leu Ala
565 570 575

Pro Val His Pro Leu Ala Glu Gln Val Lys Ile Ile Thr His Asn Gly
580 585 590

Arg Ala Gly Gly Tyr Gln Val Asp Gly Tyr Asp Gly Arg Val Leu Leu
595 600 605

Pro Cys Gly Ser Ala Ile Pro Val Pro Glu Phe Gln Ala Leu Ser Glu
610 615 620

Ser Ala Thr Met Val Tyr Asn Glu Arg Glu Phe Val Asn Arg Lys Leu
625 630 635 640

Tyr His Ile Ala Val His Gly Pro Ser Leu Asn Thr Asp Glu Glu Asn
645 650 655

Tyr Glu Lys Val Arg Ala Glu Arg Thr Asp Ala Glu Tyr Val Phe Asp
660 665 670

Val Asp Lys Lys Cys Cys Val Lys Arg Glu Glu Ala Ser Gly Leu Val
675 680 685

y* Val Gly Glu Leu Thr Asn Pro Pro Phe His Glu Phe Ala Tyr Glu
690 695 700

m Leu Lys Ile Arg Pro Ser Ala Pro Tyr Lys Thr Thr Val Val Gly
710 715 720

Val Phe Gly Val Pro Gly Ser Gly Lys Ser Ala Ile Ile Lys Ser Leu
725 730 735

O Thr Lys His Asp Leu Val Thr Ser Gly Lys Lys Glu Asn Cys Gln
740 745 750

E Ile Val Asn Asp Val Lys Lys His Arg Gly Lys Gly Thr Ser Arg
755 760 765

Glu Asn Ser Asp Ser Ile Leu Leu Asn Gly Cys Arg Arg Ala Val Asp
770 775 780

Ile Leu Tyr Val Asp Glu Ala Phe Ala Cys His Ser Gly Thr Leu Leu
785 790 795 800

Ala Leu Ile Ala Leu Val Lys Pro Arg Ser Lys Val Val Leu Cys Gly
805 810 815

Asp Pro Lys Gln Cys Gly Phe Phe Asn Met Met Gln Leu Lys Val Asn
820 825 830

Phe Asn His Asn Ile Cys Thr Glu Val Cys His Lys Ser Ile Ser Arg
835 840 845

Arg Cys Thr Arg Pro Val Thr Ala Ile Val Ser Thr Leu His Tyr Gly
850 855 860 

Gly Lys Met Arg Thr Thr Asn Pro Cys Asn Lys Pro Ile Ile Ile Asp
865 870 875 880

Thr Thr Gly Gln Thr Lys Pro Lys Pro Gly Asp Ile Val Leu Thr Cys
885 890 895

Phe Arg Gly Trp Ala Lys Gln Leu Gln Leu Asp Tyr Arg Gly His Glu
900 905 910

Val Met Thr Ala Ala Ala Ser Gln Gly Leu Thr Arg Lys Gly Val Tyr
915 920 925

Ala Val Arg Gln Lys Val Asn Glu Asn Pro Leu Tyr Ala Pro Ala Ser
930 935 940

gSi His Val Asn Val Leu Leu Thr Arg Thr Glu Asp Arg Leu Val Trp
945 950 955 960

Lys Thr Leu Ala Gly Asp Pro Trp Ile Lys Val Leu Ser Asn Ile Pro
965 970 975

gBi Gly Asn Phe Thr Ala Thr Leu Glu Glu Trp Gln Glu Glu His Asp
980 985 990

L^b Ile Met Lys Val Ile Glu Gly Pro Ala Ala Pro Val Asp Ala Phe
995 1000 1005

Gfai Asn Lys Ala Asn Val Cys Trp Ala Lys Ser Leu Val Pro Val Leu
1010 1015 1020

Asp Thr Ala Gly Ile Arg Leu Thr Ala Glu Glu Trp Ser Thr Ile Ile
1025 1030 1035 1040

Thr Ala Phe Lys Glu Asp Arg Ala Tyr Ser Pro Val Val Ala Leu Asn
1045 1050 1055

Glu Ile Cys Thr Lys Tyr Tyr Gly Val Asp Leu Asp Ser Gly Leu Phe
1060 1065 1070

Ser Ala Pro Lys Val Ser Leu Tyr Tyr Glu Asn Asn His Trp Asp Asn
1075 1080 1085

Arg Pro Gly Gly Arg Met Tyr Gly Phe Asn Ala Ala Thr Ala Ala Arg
1090 1095 1100

Leu Glu Ala Arg His Thr Phe Leu Lys Gly Gln Trp His Thr Gly Lys
1105 1110 1115 1120

Gln Ala Val Ile Ala Glu Arg Lys Ile Gln Pro Leu Ser Val Leu Asp

1125 1130 1135 

Asn Val Ile Pro Ile Asn Arg Arg Leu Pro His Ala Leu Val Ala Glu
1140 1145 1150

Tyr Lys Thr Val Lys Gly Ser Arg Val Glu Trp Leu Val Asn Lys Val
1155 1160 1165

Arg Gly Tyr His Val Leu Leu Val Ser Glu Tyr Asn Leu Ala Leu Pro
1170 1175 1180

Arg Arg Arg Val Thr Trp Leu Ser Pro Leu Asn Val Thr Gly Ala Asp
1185 1190 1195 1200

Arg Cys Tyr Asp Leu Ser Leu Gly Leu Pro Ala Asp Ala Gly Arg Phe
1205 1210 1215

Asp Leu Val Phe Val Asn Ile His Thr Glu Phe Arg Ile His His Tyr
1220 1225 1230

Gln Gln Cys Val Asp His Ala Met Lys Leu Gln Met Leu Gly Gly Asp
1235 1240 1245

Ala Leu Arg Leu Leu Lys Pro Gly Gly Ile Leu Met Arg Ala Tyr Gly
1250 1255 1260

ll^r Ala Asp Lys Ile Ser Glu Ala Val Val Ser Ser Leu Ser Arg Lys
1265 1270 1275 1280

Qie Ser Ser Ala Arg Val Leu Arg Pro Asp Cys Val Thr Ser Asn Thr
1285 1290 1295

Glu Val Phe Leu Leu Phe Ser Asn Phe Asp Asn Gly Lys Arg Pro Ser
1300 1305 1310

Thr Leu His Gln Met Asn Thr Lys Leu Ser Ala Val Tyr Ala Gly Glu
1315 1320 1325

Ala Met His Thr Ala Gly Cys Ala Pro Ser Tyr Arg Val Lys Arg Ala
1330 1335 1340

Asp Ile Ala Thr Cys Thr Glu Ala Ala Val Val Asn Ala Ala Asn Ala
1345 1350 1355 1360

Arg Gly Thr Val Gly Asp Gly Val Cys Arg Ala Val Ala Lys Lys Trp
1365 1370 1375

Pro Ser Ala Phe Lys Gly Ala Ala Thr Pro Val Gly Thr Ile Lys Thr
1380 1385 1390

Val Met Cys Gly Ser Tyr Pro Val Ile His Ala Val Ala Pro Asn Phe
1395 1400 1405



Ser Ala Thr Thr Glu Ala Glu Gly Asp Arg Glu Leu Ala Ala Val Tyr 
1410 1415 1420 

Arg Ala Val Ala Ala Glu Val Asn Arg Leu Ser Leu Ser Ser Val Ala 
1425 1430 1435 1440 

Ile Pro Leu Leu Ser Thr Gly Val Phe Ser Gly Gly Arg Asp Arg Leu
1445 1450 1455

Gln Gln Ser Leu Asn His Leu Phe Thr Ala Met Asp Ala Thr Asp Ala
1460 1465 1470

Asp Val Thr Ile Tyr Cys Arg Asp Lys Ser Trp Glu Lys Lys Ile Gln
1475 1480 1485

Glu Ala Ile Asp Met Arg Thr Ala Val Glu Leu Leu Asn Asp Asp Val
1490 1495 1500

Glu Leu Thr Thr Asp Leu Val Arg Val His Pro Asp Ser Ser Leu Val
1505 1510 1515 1520

fey Arg Lys Gly Tyr Ser Thr Thr Asp Gly Ser Leu Tyr Ser Tyr Phe
1525 1530 1535

Glu Gly Thr Lys Phe Asn Gln Ala Ala Ile Asp Met Ala Glu Ile Leu
1540 1545 1550

mtir Leu Trp Pro Arg Leu Gln Glu Ala Asn Glu Gln Ile Cys Leu Tyr
1555 1560 1565

Ala Leu Gly Glu Thr Met Asp Asn Ile Arg Ser Lys Cys Pro Val Asn
1570 1575 1580

Asp Ser Asp Ser Ser Thr Pro Pro Arg Thr Val Pro Cys Leu Cys Arg
1585 1590 1595 1600

Tyr Ala Met Thr Ala Glu Arg Ile Ala Arg Leu Arg Ser His Gln Val
1605 1610 1615

Lys Ser Met Val Val Cys Ser Ser Phe Pro Leu Pro Lys Tyr His Val
1620 1625 1630

Asp Gly Val Gln Lys Val Lys Cys Glu Lys Val Leu Leu Phe Asp Pro
1635 1640 1645 

Thr Val Pro Ser Val Val Ser Pro Arg Lys Tyr Ala Ala Ser Thr Thr 
1650 1655 1660 

Asp His Ser Asp Arg Ser Leu Arg Gly Phe Asp Leu Asp Trp Thr Thr 
1665 1670 1675 1680

Asp Ser Ser Ser Thr Ala Ser Asp Thr Met Ser Leu Pro Ser Leu Gln

1685 1690 1695 

Ser Cys Asp Ile Asp Ser Ile Tyr Glu Pro Met Ala Pro Ile Val Val
1700 1705 1710

Thr Ala Asp Val His Pro Glu Pro Ala Gly Ile Ala Asp Leu Ala Ala
1715 1720 1725

Asp Val His Pro Glu Pro Ala Asp His Val Asp Leu Glu Asn Pro Ile
1730 1735 1740

Pro Pro Pro Arg Pro Lys Arg Ala Ala Tyr Leu Ala Ser Arg Ala Ala
1745 1750 1755 1760

Glu Arg Pro Val Pro Ala Pro Arg Lys Pro Thr Pro Ala Pro Arg Thr
1765 1770 1775

Afe Phe Arg Asn Lys Leu Pro Leu Thr Phe Gly Asp Phe Asp Glu His
1780 1785 1790

gSjJj Val Asp Ala Leu Ala Ser Gly Ile Thr Phe Gly Asp Phe Asp Asp
1795 1800 1805

~1 Leu Arg Leu Gly Arg Ala Gly Ala Tyr Ile Phe Ser Ser Asp Thr
1810 1815 1820

Ser Gly His Leu Gln Gln Lys Ser Val Arg Gln His Asn Leu Gln
1825 1830 1835 1840

dps Ala Gln Leu Asp Ala Val Gln Glu Glu Lys Met Tyr Pro Pro Lys
1845 1850 1855

Leu Asp Thr Glu Arg Glu Lys Leu Leu Leu Leu Lys Met Gln Met His
1860 1865 1870

Pro Ser Glu Ala Asn Lys Ser Arg Tyr Gln Ser Arg Lys Val Glu Asn
1875 1880 1885

Met Lys Ala Thr Val Val Asp Arg Leu Thr Ser Gly Ala Arg Leu Tyr
1890 1895 1900

Thr Gly Ala Asp Val Gly Arg Ile Pro Thr Tyr Ala Val Arg Tyr Pro
1905 1910 1915 1920

Arg Pro Val Tyr Ser Pro Thr Val Ile Glu Arg Phe Ser Ser Pro Asp
1925 1930 1935

Val Ala Ile Ala Ala Cys Asn Glu Tyr Leu Ser Arg Asn Tyr Pro Thr
1940 1945 1950

Val Ala Ser Tyr Gln Ile Thr Asp Glu Tyr Asp Ala Tyr Leu Asp Met
1955 1960 1965 

Val Asp Gly Ser Asp Ser Cys Leu Asp Arg Ala Thr Phe Cys Pro Ala 
1970 1975 1980 

Lys Leu Arg Cys Tyr Pro Lys His His Ala Tyr His Gln Pro Thr Val
1985 1990 1995 2000

Arg Ser Ala Val Pro Ser Pro Phe Gln Asn Thr Leu Gln Asn Val Leu
2005 2010 2015

Ala Ala Ala Thr Lys Arg Asn Cys Asn Val Thr Gln Met Arg Glu Leu
2020 2025 2030

Pro Thr Met Asp Ser Ala Val Phe Asn Val Glu Cys Phe Lys Arg Tyr
2035 2040 2045

Ala Cys Ser Gly Glu Tyr Trp Glu Glu Tyr Ala Lys Gln Pro Ile Arg
2050 2055 2060

Ile Thr Thr Glu Asn Ile Thr Thr Tyr Val Thr Lys Leu Lys Gly Pro
2065 2070 2075 2080

A^s Ala Ala Ala Leu Phe Ala Lys Thr His Asn Leu Val Pro Leu Gln
2085 2090 2095

Glu Val Pro Met Asp Arg Phe Thr Val Asp Met Lys Arg Asp Val Lys
2100 2105 2110

Val Thr Pro Gly Thr Lys His Thr Glu Glu Arg Pro Lys Val Gln Val
2115 2120 2125

Ile Gln Ala Ala Glu Pro Leu Ala Thr Ala Tyr Leu Cys Gly Ile His
2130 2135 2140

Arg Glu Leu Val Arg Arg Leu Asn Ala Val Leu Arg Pro Asn Val His
2145 2150 2155 2160

Thr Leu Phe Asp Met Ser Ala Glu Asp Phe Asp Ala Ile Ile Ala Ser
2165 2170 2175

His Phe His Pro Gly Asp Pro Val Leu Glu Thr Asp Ile Ala Ser Phe
2180 2185 2190

Asp Lys Ser Gln Asp Asp Ser Leu Ala Leu Thr Gly Leu Met Ile Leu
2195 2200 2205

Glu Asp Leu Gly Val Asp Gln Tyr Leu Leu Asp Leu Ile Glu Ala Ala
2210 2215 2220

Phe Gly Glu Ile Ser Ser Cys His Leu Pro Thr Gly Thr Arg Phe Lys
2225 2230 2235 2240 

Phe Gly Ala Met Met Lys Ser Gly Met Phe Leu Thr Leu Phe Ile Asn
2245 2250 2255

Thr Val Leu Asn Ile Thr Ile Ala Ser Arg Val Leu Glu Gln Arg Leu
2260 2265 2270

Thr Asp Ser Ala Cys Ala Ala Phe Ile Gly Asp Asp Asn Ile Val His
2275 2280 2285

Gly Val Ile Ser Asp Lys Leu Met Ala Glu Arg Cys Ala Ser Trp Val
2290 2295 2300

Asn Met Glu Val Lys Ile Ile Asp Ala Val Met Gly Glu Lys Pro Pro
2305 2310 2315 2320

T3S Phe Cys Gly Gly Phe Ile Val Phe Asp Ser Val Thr Gln Thr Ala
2325 2330 2335

Cys Arg Val Ser Asp Pro Leu Lys Arg Leu Phe Lys Leu Gly Lys Pro
2340 2345 2350

L@ Thr Ala Glu Asp Lys Gln Asp Glu Asp Arg Arg Arg Ala Leu Ser
2355 2360 2365

A^f Glu Val Ser Lys Trp Phe Arg Thr Gly Leu Gly Ala Glu Leu Glu
2370 2375 2380

Val Ala Leu Thr Ser Arg Tyr Glu Val Glu Gly Cys Lys Ser Ile Leu
2385 2390 2395 2400

Ile Ala Met Thr Thr Leu Ala Arg Asp Ile Lys Ala Phe Lys Lys Leu
2405 2410 2415

Arg Gly Pro Val Ile His Leu Tyr Gly Gly Pro Arg Leu Val Arg
2420 2425 2430



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1253 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Asn Tyr Ile Pro Thr Gln Thr Phe Tyr Gly Arg Arg Trp Arg Pro
1 5 10 15

Arg Pro Ala Ala Arg Pro Trp Pro Leu Gln Ala Thr Pro Val Ala Pro
20 25 30

Val Val Pro Asp Phe Gln Ala Gln Gln Met Gln Gln Leu Ile Ser Ala
35 40 45

Val Asn Ala Leu Thr Met Arg Gln Asn Ala Ile Ala Pro Ala Arg Pro
50 55 60

Pro Lys Pro Lys Lys Lys Lys Thr Thr Lys Pro Lys Pro Lys Thr Gln
65 70 75 80

Pro Lys Lys Ile Asn Gly Lys Thr Gln Gln Gln Lys Lys Lys Asp Lys
85 90 95

Gln Ala Asp Lys Lys Lys Lys Lys Pro Gly Lys Arg Glu Arg Met Cys
100 105 110

Met Lys Ile Glu Asn Asp Cys Ile Phe Glu Val Lys His Glu Gly Lys
115 120 125

Val Thr Gly Tyr Ala Cys Leu Val Gly Asp Lys Val Met Lys Pro Ala
130 135 140

His Val Lys Gly Val Ile Asp Asn Ala Asp Leu Ala Lys Leu Ala Phe
145 150 155 160

Lys Lys Ser Ser Lys Tyr Asp Leu Glu Cys Ala Gln Ile Pro Val His
165 170 175

Met Arg Ser Asp Ala Ser Lys Tyr Thr His Glu Lys Pro Glu Gly His
180 185 190

Tyr Asn Trp His His Gly Ala Val Gln Tyr Ser Gly Gly Arg Phe Thr
195 200 205

Ile Pro Thr Gly Ala Gly Lys Pro Gly Asp Ser Gly Arg Pro Ile Phe
210 215 220

Asp Asn Lys Gly Arg Val Val Ala Ile Val Leu Gly Gly Ala Asn Glu
225 230 235 240

Gly Ser Arg Thr Ala Leu Ser Val Val Thr Trp Asn Lys Asp Met Val
245 250 255

Thr Arg Val Thr Pro Glu Gly Ser Glu Glu Trp Ser Ala Pro Leu Ile
260 265 270

Thr Ala Met Cys Val Leu Ala Asn Ala Thr Phe Pro Cys Phe Gln Pro
275 280 285 

Pro Cys Val Pro Cys Cys Tyr Glu Asn Asn Ala Glu Ala Thr Leu Arg 
290 295 300 

Met Leu Glu Asp Asn Val Asp Arg Pro Gly Tyr Tyr Asp Leu Leu Gln
305 310 315 320

Ala Ala Leu Thr Cys Arg Asn Gly Thr Arg His Arg Arg Ser Val Ser
325 330 335

Gln His Phe Asn Val Tyr Lys Ala Thr Arg Pro Tyr Ile Ala Tyr Cys
340 345 350

Ala Asp Cys Gly Ala Gly His Ser Cys His Ser Pro Val Ala Ile Glu
355 360 365

Ala Val Arg Ser Glu Ala Thr Asp Gly Met Leu Lys Ile Gln Phe Ser
370 375 380

Ala Gln Ile Gly Ile Asp Lys Ser Asp Asn His Asp Tyr Thr Lys Ile
385 390 395 400

Arg Tyr Ala Asp Gly His Ala Ile Glu Asn Ala Val Arg Ser Ser Leu
405 410 415

Lys Val Ala Thr Ser Gly Asp Cys Phe Val His Gly Thr Met Gly His
420 425 430

Phe Ile Leu Ala Lys Cys Pro Pro Gly Glu Phe Leu Gln Val Ser Ile
435 440 445

Gln Asp Thr Arg Asn Ala Val Arg Ala Cys Arg Ile Gln Tyr His His
450 455 460

Asp Pro Gln Pro Val Gly Arg Glu Lys Phe Thr Ile Arg Pro His Tyr
465 470 475 480

Gly Lys Glu Ile Pro Cys Thr Thr Tyr Gln Gln Thr Thr Ala Lys Thr
485 490 495

Val Glu Glu Ile Asp Met His Met Pro Pro Asp Thr Pro Asp Arg Thr
500 505 510

Leu Leu Ser Gln Gln Ser Gly Asn Val Lys Ile Thr Val Gly Gly Lys
515 520 525

Lys Val Lys Tyr Asn Cys Thr Cys Gly Thr Gly Asn Val Gly Thr Thr
530 535 540

Asn Ser Asp Met Thr Ile Asn Thr Cys Leu Ile Glu Gln Cys His Val
545 550 555 560 

Ser Val Thr Asp His Lys Lys Trp Gln Phe Asn Ser Pro Phe Val Pro
565 570 575

Arg Ala Asp Glu Pro Ala Arg Lys Gly Lys Val His Ile Pro Phe Pro
580 585 590

Leu Asp Asn Ile Thr Cys Arg Val Pro Met Ala Arg Glu Pro Thr Val
595 600 605

Ile His Gly Lys Arg Glu Val Thr Leu His Leu His Pro Asp His Pro
610 615 620

Thr Leu Phe Ser Tyr Arg Thr Leu Gly Glu Asp Pro Gln Tyr His Glu
625 630 635 640

Glu Trp Val Thr Ala Ala Val Glu Arg Thr Ile Pro Val Pro Val Asp
645 650 655

Gly Met Glu Tyr His Trp Gly Asn Asn Asp Pro Val Arg Leu Trp Ser
660 665 670

Gln Leu Thr Thr Glu Gly Lys Pro His Gly Trp Pro His Gln Ile Val
675 680 685

Gln Tyr Tyr Tyr Gly Leu Tyr Pro Ala Ala Thr Val Ser Ala Val Val
690 695 700

Gly Met Ser Leu Leu Ala Leu Ile Ser Ile Phe Ala Ser Cys Tyr Met
705 710 715 720

Leu Val Ala Ala Arg Ser Lys Cys Leu Thr Pro Tyr Ala Leu Thr Pro
725 730 735

Gly Ala Ala Val Pro Trp Thr Leu Gly Ile Leu Cys Cys Ala Pro Arg
740 745 750

Ala His Ala Ala Ser Val Ala Glu Thr Met Ala Tyr Leu Trp Asp Gln
755 760 765

Asn Gln Ala Leu Phe Trp Leu Glu Phe Ala Ala Pro Val Ala Cys Ile
770 775 780

Leu Ile Ile Thr Tyr Cys Leu Arg Asn Val Leu Cys Cys Cys Lys Ser
785 790 795 800

Leu Ser Phe Leu Val Leu Leu Ser Leu Gly Ala Thr Ala Arg Ala Tyr
805 810 815

Glu His Ser Thr Val Met Pro Asn Val Val Gly Phe Pro Tyr Lys Ala
820 825 830 

His Ile Glu Arg Pro Gly Tyr Ser Pro Leu Thr Leu Gln Met Gln Val
835 840 845

Val Glu Thr Ser Leu Glu Pro Thr Leu Asn Leu Glu Tyr Ile Thr Cys
850 855 860

Glu Tyr Lys Thr Val Val Pro Ser Pro Tyr Val Lys Cys Cys Gly Ala
865 870 875 880

Ser Glu Cys Ser Thr Lys Glu Lys Pro Asp Tyr Gln Cys Lys Val Tyr
885 890 895

Thr Gly Val Tyr Pro Phe Met Trp Gly Gly Ala Tyr Cys Phe Cys Asp
900 905 910

Ser Glu Asn Thr Gln Leu Ser Glu Ala Tyr Val Asp Arg Ser Asp Val
915 920 925

Cys Arg His Asp His Ala Ser Ala Tyr Lys Ala His Thr Ala Ser Leu
930 935 940

Lys Ala Lys Val Arg Val Met Tyr Gly Asn Val Asn Gln Thr Val Asp
945 950 955 960

Val Tyr Val Asn Gly Asp His Ala Val Thr Ile Gly Gly Thr Gln Phe
965 970 975

Ile Phe Gly Pro Leu Ser Ser Ala Trp Thr Pro Phe Asp Asn Lys Ile
980 985 990

Val Val Tyr Lys Asp Glu Val Phe Asn Gln Asp Phe Pro Pro Tyr Gly
995 1000 1005

Ser Gly Gln Pro Gly Arg Phe Gly Asp Ile Gln Ser Arg Thr Val Glu
1010 1015 1020

Ser Asn Asp Leu Tyr Ala Asn Thr Ala Leu Lys Leu Ala Arg Pro Ser
1025 1030 1035 1040

Pro Gly Met Val His Val Pro Tyr Thr Gln Thr Pro Ser Gly Phe Lys
1045 1050 1055

Tyr Trp Leu Lys Glu Lys Gly Thr Ala Leu Asn Thr Lys Ala Pro Phe
1060 1065 1070

Gly Cys Gln Ile Lys Thr Asn Pro Val Arg Ala Met Asn Cys Ala Val
1075 1080 1085

Gly Asn Ile Pro Val Ser Met Asn Leu Pro Asp Ser Ala Phe Thr Arg
1090 1095 1100 

Ile Val Glu Ala Pro Thr Ile Ile Asp Leu Thr Cys Thr Val Ala Thr
1105 1110 1115 1120

Cys Thr His Ser Ser Asp Phe Gly Gly Val Leu Thr Leu Thr Tyr Lys
1125 1130 1135

Thr Asn Lys Asn Gly Asp Cys Ser Val His Ser His Ser Asn Val Ala
1140 1145 1150

Thr Leu Gln Glu Ala Thr Ala Lys Val Lys Thr Ala Gly Lys Val Thr
1155 1160 1165

Leu His Phe Ser Thr Ala Ser Ala Ser Pro Ser Phe Val Val Ser Leu
1170 1175 1180

Cys Ser Ala Arg Ala Thr Cys Ser Ala Ser Cys Glu Pro Pro Lys Asp
1185 1190 1195 1200

His Ile Val Pro Tyr Ala Ala Ser His Ser Asn Val Val Phe Pro Asp
1205 1210 1215

Met Ser Gly Thr Ala Leu Ser Trp Val Gln Lys Ile Ser Gly Gly Leu
1220 1225 1230

Gly Ala Phe Ala Ile Gly Ala Ile Leu Val Leu Val Val Val Thr Cys
1235 1240 1245

Ile Gly Leu Arg Arg
1250

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..115

(D) OTHER INFORMATION: /label= 26S_region

/note= "26S promoter and transcription start and
proximal downstream region of pSFV1; Figure 8."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..24

(D) OTHER INFORMATION: /product= "26S promoter region"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
ACCTCTACGG CGGTCCTAGA TTGGTGCGTT AATACACAGA ATCTGATTGG ATCCCGGGTA 60
ATTAATTGAA TTACATCCCT ACGCAAACGT TTTACGGCCG CCGGTGGCGC CCGCG 115
(*§) INFORMATION FOR SEQ ID NO: 5: 

*3 

O (i) SEQUENCE CHARACTERISTICS: 
M 1 (A) LENGTH: 127 base pairs 

1=4 (B) TYPE: nucleic acid 

O (C) STRANDEDNESS : single 

B\ (D) TOPOLOGY: linear 

Q (ii) MOLECULE TYPE: RNA (genomic) 

U (iii) HYPOTHETICAL: NO 

5 (iv) ANTI -SENSE: NO 

O 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..127

(D) OTHER INFORMATION: /label= 26S_region

/note= "26S promoter and transcription start and
proximal downstream region of pSFV2; Figure 8."

(ix) FEATURE:

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..24 

(D) OTHER INFORMATION: /product= "26S promoter region" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ACCTCTACGG CGGTCCTAGA TTGGTGCGTT AATACACAGA ATTCTGATTA TAGCGCACTA 60
TTATATAGCA CCGGATCCCG GGTAATTAAT TGACGCAAAC GTTTTACGGC CGCCGGTGGC 120
GCCCGCG 127
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..123

(D) OTHER INFORMATION: /label= 26S_region

/note= "26S promoter and transcription start and
proximal downstream region of pSFV3; Figure 8."

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..24

(D) OTHER INFORMATION: /product= "26S promoter region"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
ACCTCTACGG CGGTCCTAGA TTGGTGCGTT AATACACAGA ATTCTGATTA TAGCGCACTA 60
TTATATAGCA CCATGGATCC CGGGTAATTA ATTGACGTTT TACGGCCGCC GGTGGCGCCC 120
GCG 123
(2) INFORMATION FOR SEQ ID NO:7:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: RNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Semliki Forest Virus

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..54

(D) OTHER INFORMATION: /label= restrict_site

/note= "sequence of SFV E2 genome in vicinity of BamHI site
in SFV vector E2; Figure 12."

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

AAC TCA CCT TTC GTC CCG AGA GCC GAC GAA CCG GCT AGA AAA GGC AAA 48
Asn Ser Pro Phe Val Pro Arg Ala Asp Glu Pro Ala Arg Lys Gly Lys
1 5 10 15

GTC CAT 54
Val His

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Asn Ser Pro Phe Val Pro Arg Ala Asp Glu Pro Ala Arg Lys Gly Lys
1 5 10 15

Val His 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: HIV

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..46

(D) OTHER INFORMATION: /label= fragment

/note= "HIV gp120 epitope introduced into SFV
vector E2; Figure 12."

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..45 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



GAT CCG CGT ATC CAG AGA GGA CCA GGA AGA GCA TTT GTT GAG CTA 45
Asp Pro Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Glu Leu
1 5 10 15

G 46

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Asp Pro Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Glu Leu
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..51

(D) OTHER INFORMATION: /label= chimaeric_seq

/note= "SFV-HIV chimaeric sequence shown in Figure
12."

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..51

(D) OTHER INFORMATION: /product= "SFV-HIV chimaeric
sequence"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GAG GAT CCG CGT ATC CAG AGA GGA CCA GGA AGA GCA TTT GTT GAG GAT 48
Glu Asp Pro Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Glu Asp
1 5 10 15

CCG 51
Pro

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Glu Asp Pro Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Glu Asp
1 5 10 15

Pro



(2) INFORMATION FOR SEQ ID NO:13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..60

(D) OTHER INFORMATION: /label= oligonucleotide 

/note= "used to introduce new linker site" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CGGCCAGTGA ATTCTGATTG GATCCCGGGT AATTAATTGA ATTACATCCC TACGCAAACG 60 

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: - 

(B) LOCATION: 1..62 

(D) OTHER INFORMATION: /label= oligonucleotide 

/note= "used to introduce new linker site" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCGCACTATT ATAGCACCGG CTCCCGGGTA ATTAATTGAC GCAAACGTTT TACGGCCGCC 60 
GG 62 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..62 

(D) OTHER INFORMATION: /label= oligonucleotide

/note= "used to introduce new linker site" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GCGCACTATT ATAGCACCAT GGATCCGGGT AATTAATTGA CGTTTTACGG CCGCCGGTGG 60
CG 62

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..21 

(D) OTHER INFORMATION: /label= primer 

/note= "SP1 upstream sequencing primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CGGCGGTCCT AGATTGGTGC G 21
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..21 

(D) OTHER INFORMATION: /label= primer 

/note= "SP2 downstream sequencing primer" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CGCGGGCGCC ACCGGCGGCC G 21

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..21

(D) OTHER INFORMATION: /label= primer

/note= "primer-1 for first strand cDNA synthesis"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
TTTCTCGTAG TTCTCCTCGT C 21 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..27 

(D) OTHER INFORMATION: /label= primer 

/note= "primer-2 for first strand cDNA synthesis" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
QSrATCCCAG TGGTTGTTCT CGTAATA 27

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..28 

(D) OTHER INFORMATION: /label= primer 

/note= "5' most primer for second strand cDNA 
synthesis, equals bp 1-28 of SFV sequence" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
ATGGCGGATG TGTGACATAC ACGACGCC 28

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..46 

(D) OTHER INFORMATION: /label= adaptor 
/note= "5'-sticky end
(EcoRI-HindIII-NotI-XmaIII-SpeI) blunt end-3'
adaptor"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

AATTCAAGCT TGCGGCCGCA CTAGTGTTCG AACGCCGGCG TGATCA 46

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..8 

(D) OTHER INFORMATION: /label= oligonucleotide
/note= "NcoI oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GCCATGGC 8

(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..20 

(D) OTHER INFORMATION: /label= oligonucleotide 

/note= "oligonucleotide used for screening by
colony hybridization"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GGTGACACTA TAGCCATGGC 20
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO




(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..24

(D) OTHER INFORMATION: /label= oligonucleotide

/note= "site-directed mutagenic oligonucleotide
used to introduce a BamHI site into the SFV
genome"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

GATCGGCCTA GGAGCCGAGA GCCC 24
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Semliki Forest Virus

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..80

(D) OTHER INFORMATION: /label= terminator

/note= "3' terminal sequence of cDNA expression
vector complementary to alphavirus genomic RNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

^TCCAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 60 

AAAAAAAAAA AAAAACTAGT 80

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Semliki Forest Virus

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..54

(D) OTHER INFORMATION: /label= restrict_site

/note= "sequence of SFV vector E2 in vicinity of BamHI site;
Figure 12."

(ix) FEATURE:

(A) NAME/KEY: mutation

(B) LOCATION: 27..32

(D) OTHER INFORMATION: /label= restriction_sit

/note= "BamHI recognition sequence introduced into
SFV E2 genome in SFV vector E2."

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..54



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AAC TCA CCT TTC GTC CCG AGA GCC GAG GAT CCG GCT AGA AAA GGC AAA 48
Asn Ser Pro Phe Val Pro Arg Ala Glu Asp Pro Ala Arg Lys Gly Lys
1 5 10 15

GTC CAT 54
Val His

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Asn Ser Pro Phe Val Pro Arg Ala Glu Asp Pro Ala Arg Lys Gly Lys 
1 5 10 15



Val His 



